Bayesian exploration for approximate dynamic programming

Authors

Ilya O. Ryzhov

Martijn R.K. Mes

Warren B. Powell

Gerald A. van den Berg

Abstract

Approximate dynamic programming (ADP) is a general methodological framework for multistage stochastic optimization problems in transportation, finance, energy, and other applications where scarce resources must be allocated optimally. We propose a new approach to the exploration/exploitation dilemma in ADP. First, we show how a Bayesian belief structure can be used to express uncertainty about the value function in ADP. Bayesian models can be integrated into both parametric and non-parametric value function approximations, which is vital for practical implementation. Second, we propose a new exploration strategy, based on the concept of value of information from the optimal learning literature, and prove that it systematically explores the state space. We evaluate this strategy using a variety of distinct resource allocation problems and demonstrate that it is highly competitive against other exploration strategies.
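The abstract does not give formulas, so the following is only a minimal sketch of the general idea of value-of-information exploration, not the paper's algorithm. It assumes independent normal beliefs over a small set of alternatives (for example, the estimated values of candidate next states) and uses the standard knowledge-gradient calculation from the optimal learning literature. The function names `knowledge_gradient` and `bayes_update`, and the known observation-noise variance `noise_var`, are assumptions introduced for illustration.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def knowledge_gradient(means, variances, noise_var):
    """Expected one-step improvement in the best posterior mean from
    measuring each alternative once (independent normal beliefs).

    Assumes at least two alternatives and strictly positive variances.
    """
    scores = []
    for i, (mu, var) in enumerate(zip(means, variances)):
        # Standard deviation of the change in the posterior mean of i.
        sigma_tilde = var / math.sqrt(var + noise_var)
        # Best competing belief among the other alternatives.
        best_other = max(m for j, m in enumerate(means) if j != i)
        zeta = -abs(mu - best_other) / sigma_tilde
        scores.append(sigma_tilde * (zeta * normal_cdf(zeta) + normal_pdf(zeta)))
    return scores


def bayes_update(mu, var, observation, noise_var):
    """Conjugate normal update of one belief after a noisy observation."""
    precision = 1.0 / var + 1.0 / noise_var
    new_mu = (mu / var + observation / noise_var) / precision
    return new_mu, 1.0 / precision
```

In a sketch like this, an exploration rule would measure the alternative with the largest score and then call `bayes_update` on its belief. With equal means, the more uncertain alternative scores higher, which is what drives exploration.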


Related articles

Linear Bayesian Reinforcement Learning

This paper proposes a simple linear Bayesian approach to reinforcement learning. We show that with an appropriate basis, a Bayesian linear Gaussian model is sufficient for accurately estimating the system dynamics, and in particular when we allow for correlated noise. Policies are estimated by first sampling a transition model from the current posterior, and then performing approximate dynamic ...


Cost Analysis of Acceptance Sampling Models Using Dynamic Programming and Bayesian Inference Considering Inspection Errors

Acceptance sampling models have been widely applied in companies for inspecting and testing raw materials as well as final products. Many lots of items are produced each day in industry, so it may be impossible to inspect or test every item in a lot. Acceptance sampling models only provide a guarantee for the producer and consumer that the items in the lots are acco...


Approximate Incremental Dynamic Analysis Using Reduction of Ground Motion Records

Incremental dynamic analysis (IDA) requires the analysis of the non-linear response history of a structure for an ensemble of ground motions, each scaled to multiple levels of intensity and selected to cover the entire range of structural response. Recognizing that IDA of practical structures is computationally demanding, an approximate procedure based on the reduction of the number of ground m...


Cover tree Bayesian reinforcement learning

This paper proposes an online tree-based Bayesian approach for reinforcement learning. For inference, we employ a generalised context tree model. This defines a distribution on multivariate Gaussian piecewise-linear models, which can be updated in closed form. The tree structure itself is constructed using the cover tree method, which remains efficient in high dimensional spaces. We combine the...


Bayesian Optimization with a Finite Budget: An Approximate Dynamic Programming Approach

We consider the problem of optimizing an expensive objective function when a finite budget of total evaluations is prescribed. In that context, the optimal solution strategy for Bayesian optimization can be formulated as a dynamic programming instance. This results in a complex problem with uncountable, dimension-increasing state space and an uncountable control space. We show how to approximat...



Publication date: 2015
